Abstract. The leukocyte function-associated molecule 1 (LFA-1, CD11a/CD18)
is a membrane glycoprotein which functions in cell-cell adhesion by
heterophilic interaction with intercellular adhesion molecule 1 (ICAM-1).
LFA-1 consists of an α subunit (M_r = 180,000) and a β subunit
(M_r = 95,000). We report the molecular biology and protein sequence of the
α subunit. Overlapping cDNAs containing 5,139 nucleotides were isolated
using an oligonucleotide specified by tryptic peptide sequence. The mRNA of
5.5 kb is expressed in lymphoid and myeloid cells but not in a bladder
carcinoma cell line. The protein has a 1,063-amino acid extracellular
domain, a 29-amino acid transmembrane region, and a 53-amino acid
cytoplasmic tail. The extracellular domain contains seven repeats. Repeats
V-VII are in tandem and contain putative divalent cation binding sites.
LFA-1 has significant homology to the members of the integrin superfamily,
having 36% identity with the Mac-1 and p150,95 α subunits and 28% identity
with other integrin α subunits. An insertion of ~200 amino acids is present
in the NH2-terminal region of LFA-1. This "inserted/interactive" or I domain
is also present in the p150,95 and Mac-1 α subunits but is absent from other
integrin α subunits sequenced to date. The I domain has striking homology to
three repeats in human von Willebrand factor, two repeats in chicken
cartilage matrix protein, and a region of complement factor B. These
structural features indicate a bipartite evolution from the integrin family
and from an I domain family. These features may also correspond to relevant
functional domains.



The leukocyte function-associated 1 (LFA-1)¹ molecule is a member of a
family of three leukocyte glycoproteins involved in cell-cell adhesion. The
members of this family, LFA-1, Mac-1, and p150,95, are heterodimers
consisting of distinct α subunits (M_r = 180,000, 170,000, and 150,000,
respectively) and a common β subunit (M_r = 95,000) (25, 47). NH2-terminal
sequencing of the α subunits in the mouse and human has suggested that they
are structurally related (33, 48); however, the cell surface expression and
function of these molecules differ. LFA-1 is expressed on virtually all
leukocytes and is involved in a large number of adhesion-dependent
phenomena. mAbs directed against LFA-1 inhibit antigen-specific T helper
cell function and cytolytic functions such as cytotoxic T
lymphocyte-mediated killing, antibody-dependent cytotoxicity by
granulocytes, and natural killer activity (47). LFA-1 is also involved in
antigen-independent interactions that mediate cell localization to sites of
inflammation, such as leukocyte adhesion to endothelial cells, fibroblasts,
epidermal keratinocytes, and synovial cells (12-14, 17, 30, 53). Mac-1 and
p150,95 are expressed on monocytes, granulocytes, and some activated
lymphocytes, and function as adhesion molecules in cell-cell and
cell-substrate interactions as well as complement receptors for C3bi (1).

1. Abbreviations used in this paper: CMP, cartilage matrix protein; ECM,
extracellular matrix; FNR, fibronectin receptor; ICAM-1, intercellular
adhesion molecule-1; LFA-1, leukocyte function-associated 1; vWF, von
Willebrand factor.

The importance of these three glycoproteins is signified by the clinical
syndrome known as leukocyte adhesion deficiency (LAD) (1). The primary
defect in LAD occurs in the β subunit common to LFA-1, Mac-1, and p150,95
(22), resulting in deficient surface expression of the αβ complexes. LAD
patients have recurrent bacterial infections, which are sometimes fatal, and
their leukocytes are deficient in a wide range of adhesion-dependent
functions.

The structure of the β subunit common to LFA-1, Mac-1, and p150,95 has been
determined (23, 27) and has revealed homology to extracellular matrix (ECM)
receptors. These similarities led to the concept of a family of α/β
heterodimers designated the integrins (21, 41). The term "integrin"
emphasizes the role of these proteins as transmembrane links between the
extracellular environment and the cytoskeleton. Three subfamilies of
integrins are defined by their distinct β subunits. The β1 subunit is common
to the fibronectin receptor (FNR) and some antigens appearing very late in
leukocyte activation, whereas β3 is common to platelet glycoprotein IIb/IIIa
(gpIIb/IIIa) and the vitronectin receptor. β1 and β3 integrins are all ECM
receptors and are involved in cell-substrate adhesion, matrix assembly,
regulation of cell growth, differentiation, and localization during
morphogenesis, and wound healing. We will refer to the β1 and β3 families as
the extracellular matrix receptor integrins. The β2 subunit is common to
LFA-1, Mac-1, and p150,95, whose expression is limited to leukocytes, and we
designate these as the leukocyte integrins.

The ligand of LFA-1 is an inducible cell surface glycoprotein, intercellular
adhesion molecule-1 (ICAM-1) (M_r = 90,000), which is found on cells of many
lineages including leukocytes, endothelial cells, fibroblasts, and epidermal
keratinocytes (14, 29, 40, 46, 49). ICAM-1 mRNA and surface expression are
induced by inflammatory mediators including interferon-gamma, interleukin-1,
tumor necrosis factor, and lipopolysaccharide (36, 46, 49); thus, it may
regulate cell interaction and localization in inflammation. LFA-1-dependent
adhesion of cells to planar lipid membranes containing ICAM-1 requires
metabolic energy, a functional cytoskeleton, and divalent cations (29). Cell
activation can enhance LFA-1-dependent adhesion without any effect on LFA-1
or ICAM-1 surface expression, as shown by phorbol ester-induced homotypic
adhesion of B, T, and monocytic cells (39). ICAM-1 is a member of the
immunoglobulin superfamily and consists of five immunoglobulin constant
region-like domains. In contrast to most ligands of the ECM receptor
integrins, which contain the critical recognition sequence
arginine-glycine-aspartic acid (RGD), ICAM-1 does not contain an RGD
sequence (46, 49). The LFA-1-ICAM-1 receptor-ligand pair is thus far the
only known example of a member of the integrin superfamily interacting with
a member of the immunoglobulin superfamily.

We have been interested in the structural basis for the important function
of LFA-1 in inflammation and the immune response. Furthermore, we wished to
define the relationship of LFA-1 to other leukocyte integrins and to the ECM
receptor integrins. The lack of an RGD sequence in ICAM-1 raised the
question of whether the LFA-1 α subunit has structural features typical of
ECM integrins or whether novel features consistent with the lack of RGD
recognition exist. Therefore, we have characterized the LFA-1 α subunit at
the protein and mRNA level and have determined its complete sequence. The
amino acid sequence demonstrates that LFA-1 is an integral membrane protein
containing an extracellular domain, a single hydrophobic transmembrane
domain, and a cytoplasmic tail. The extracellular domain has seven repeats;
the three repeats located most COOH-terminal contain putative divalent
cation binding sites. LFA-1 has significant homology with the other members
of the integrin superfamily. A region near the NH2 terminus of the molecule
contains an insertion of ~200 amino acids similar to p150,95 (9) and Mac-1
(4, 7, 38). This domain has significant homology to the type A domains of
von Willebrand factor (vWF), complement factor B, and the repeats of chicken
cartilage matrix protein (CMP). These similarities suggest relevant
functional domains within the LFA-1 α subunit as well as novel evolutionary
relationships.



Materials and Methods 
Protein Purification 

The mAb TS1/22, which is directed against the LFA-1 α subunit, was purified
and coupled using cyanogen bromide to CL-4B Sepharose at 2 mg mAb per ml of
packed bed. SKW3 cells (42.2 g) were lysed in 300 ml of lysis buffer (25).
The lysate was centrifuged at 5,000 g, the pellet was discarded, and the
supernatant was centrifuged at 16,000 g for 2 h. The supernatant was then
passed sequentially through a precolumn of activated and quenched CL-4B
Sepharose and a TS1/22 mAb Sepharose column. The TS1/22 column was washed
sequentially (25), and the LFA-1 molecule was eluted with 0.5 M NaCl, 0.1%
Triton X-100, 1 mM iodoacetamide, 10 U/ml aprotinin, 0.025% NaN3, and 50 mM
triethylamine, pH 11.5, and the pH was immediately neutralized. The
fractions containing LFA-1 were pooled, lyophilized, and precipitated in
5 vol ethanol at -20°C overnight.

Purified protein was reduced and alkylated (23) and subjected to preparative
SDS-PAGE. The band corresponding to the α subunit was visualized with 1 M
KCl, excised, and electroeluted (20). The purified α subunit was lyophilized
and precipitated in 4 vol ethanol at -20°C overnight. The pellet was
resuspended and digested with 1% (wt/wt) trypsin (23). The tryptic fragments
were then isolated by HPLC (Beckman Instruments Inc., Palo Alto, CA) on a C4
reverse phase column (Vydac, Hesperia, CA). The peptides were eluted on a
0-60% acetonitrile gradient in 1% trifluoroacetic acid. Several peaks were
rechromatographed isocratically in a concentration of acetonitrile
determined by the equation F = 0.9E - 2, where F is the volume percentage of
acetonitrile under isocratic conditions for a peptide that eluted at E
percent during the linear gradient (57). Peaks were collected in 1.5-ml
polypropylene tubes and concentrated to <50 µl. Eight peaks were subjected
to microsequencing. The sequence of one peptide, L64, was used to synthesize
a single sequence oligonucleotide
(5'-GGGATGTTGTGGTCATGGATGGTGGGCTCAAT-3') according to the suggestions of
Lathe (26).
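
The isocratic rule quoted above is simple enough to tabulate. The following is a minimal sketch: only the relation F = 0.9E - 2 comes from the text, while the function name and the example elution percentages are illustrative assumptions rather than data from the purification.

```python
def isocratic_acetonitrile_percent(gradient_elution_percent: float) -> float:
    """Return F, the isocratic acetonitrile volume percentage, from E.

    E is the acetonitrile percentage at which the peptide eluted during the
    linear gradient; F = 0.9E - 2 is the relation quoted in the text.
    """
    if not 0.0 <= gradient_elution_percent <= 100.0:
        raise ValueError("E must be a percentage between 0 and 100")
    return 0.9 * gradient_elution_percent - 2.0


# Hypothetical elution points within the 0-60% gradient (not from the paper).
for e in (20.0, 30.0, 45.0):
    print(f"E = {e:.0f}%  ->  F = {isocratic_acetonitrile_percent(e):.1f}%")
```

The rule always places the isocratic concentration somewhat below the gradient elution point, which is what one would expect if the aim is to slow the peptide down and resolve it from close neighbors.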

cDNA Cloning, Restriction Mapping, and 
Nucleotide Sequencing 

The production and screening of the cDNA library was performed as previously
described (9). Restriction maps of the selected clones were determined by
double and partial digests (28). Restriction fragments were subcloned into
either M13mp18 and mp19 (31) or pGEM-3Z, 4Z, or 7Z (Promega Biotec, Madison,
WI). Deletions of the fragments in pGEM were made using Exonuclease III and
S1 nuclease (18). Sequencing was by the dideoxy termination method (43). The
coding, 5'-untranslated, and 3'-untranslated regions were sequenced in both
orientations over 100, 100, and 36.7% of their lengths, respectively.

Southern and Northern Blot 

Southern blots were performed as described elsewhere (8). For Northern
blots, 10 µg of poly(A+) RNA isolated from SKW3, U937, IB4, or EJ cells was
subjected to electrophoresis on a 1.0% formaldehyde gel and transferred to
nitrocellulose (5). The nitrocellulose (Bio-Rad Laboratories, Richmond, CA)
was prehybridized and hybridized in 2× SSC, 1× Denhardt's solution, 0.1%
SDS, and 10 µg/ml of herring sperm DNA. A 1.8-kb Eco RI fragment from the 5'
end of the cDNA clone λ3R1 (Fig. 2) was labeled by nick translation and used
as probe.

Computer Analysis 

Homology searches and alignment of sequences used the Microgenie DNA program
(Beckman Instruments Inc.), FASTP (56) on the NBRF and NEW databases
(National Biomedical Research Foundation, Washington, DC), and FASTP using
the SWISS-PROT data base (Bionet, IntelliGenetics, Mountain View, CA). These
alignments were then optimized, and the percent identity and statistical
significance were determined using ALIGN (NBRF) (11).

Hydrophobicity was determined according to Hopp and Woods using the
Microgenie DNA program (19).
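
A Hopp and Woods style profile is a sliding-window average of per-residue hydrophilicity values. The sketch below illustrates that general technique and is not the Microgenie implementation; the scale is entered as it is commonly tabulated and should be checked against the original publication, and the window of six residues and the test peptide are assumptions.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated (assumed here;
# verify against the original publication before relying on them).
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "H": -0.5, "A": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(sequence: str, window: int = 6) -> list[float]:
    """Average the per-residue values over each window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [HOPP_WOODS[res] for res in seq]  # KeyError on unknown residues
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical peptide: a charged stretch followed by a hydrophobic one.
profile = hydrophilicity_profile("KDEKRSLLIVLLFIV")
most_hydrophobic_start = min(range(len(profile)), key=profile.__getitem__)
print(most_hydrophobic_start, round(profile[most_hydrophobic_start], 2))
```

In a profile of this kind, a long run of strongly negative windows is the signature of a candidate membrane-spanning or signal segment.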

Results and Discussion 

Protein Purification and Peptide Sequence 

LFA-1 was solubilized from SKW3, a T lymphoma cell line, and isolated by mAb
affinity chromatography using an antibody directed against the α subunit.
SDS-PAGE showed the α and β subunits (Fig. 1, lane 1). The α subunit was
further purified by preparative SDS-PAGE (Fig. 1, lane 2). The purified α
subunit was digested with trypsin and the peptides were isolated by reverse
phase HPLC. The sequence of nine peptides was determined by microsequencing
(Table I), and the sequence possessing the lowest codon redundancy was used
to specify a single oligonucleotide sequence which used the most commonly
occurring human codons (26).

Figure 1. Protein purification of LFA-1 α subunit. Silver-stained SDS-PAGE
gel showing a column fraction of the affinity-purified protein (lane 1) and
the electroeluted α subunit (lane 2).

cDNA Cloning and Characterization 

The 32-mer oligonucleotide probe was used to isolate 20 clones from a
size-selected λgt10 cDNA library constructed from PMA-stimulated myeloid
cells (9). These cells have been previously shown to synthesize the LFA-1 α
subunit (32). The insert sizes were determined and the longest clone, λ5L5,
was restriction mapped and sequenced (Fig. 2). This clone contained the
nucleotide sequence corresponding to the oligonucleotide probe (85% identity
to the "best-guess" probe) and agreed perfectly with the tryptic peptide
sequence (16 of 16 residues). However, this clone did not encode the entire
protein since not all of the tryptic peptide sequences were present in the
open reading frame. The 5' 1.0-kb Eco RI fragment of λ5L5 was used to select
14 additional clones. The clone λ3R1 had an identical restriction map in the
overlapping regions and contained an additional 1.0-kb 5' fragment (Fig. 2).

Table I. Tryptic Peptide Sequences

Residues    Sequence
95-107      XXDQ(N)TYLSGL(E)YLF
199-210     HMLLLTNTFGAI
254-260     YIIGIGK
405-413     VLLFQEXQG
494-503     GEAITALTXI
541-554     IEGTQVLSGIQXFG
564-576     X(L)E(G)D(V/G)LADVAVGAE
803-817     KVEMLKPHSEIXVS
929-946     I(E/Q)PSIHDHNIPXLEAVXG

Parentheses indicate ambiguity in the sequence. The underlined residues were
used to generate an oligonucleotide probe.

The composite sequence of λ5L5 and λ3R1 contains 5,139 nucleotides (Fig. 3).
There is an open reading frame of 3,510 nucleotides, a 5' untranslated
region of 94 nucleotides, and a 3' untranslated region of 1,535 nucleotides
which contains a polyadenylation signal site 15 nucleotides before the
poly(A+) tail. Within the 3' untranslated region there is a typical Alu I
repeat consisting of two tandem related sequences each terminated by an
A-rich segment (16).
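
The region lengths quoted above can be cross-checked with a few lines of arithmetic. This sketch uses only numbers given in the text and abstract; the variable names are illustrative, and it assumes that the stop codon is not counted within the 3,510-nucleotide reading frame.

```python
# Lengths quoted in the text for the composite cDNA (nucleotides).
FIVE_PRIME_UTR_NT = 94
OPEN_READING_FRAME_NT = 3_510
THREE_PRIME_UTR_NT = 1_535
COMPOSITE_NT = 5_139

# Domain sizes quoted in the abstract (amino acids).
EXTRACELLULAR_AA = 1_063
TRANSMEMBRANE_AA = 29
CYTOPLASMIC_AA = 53

total_nt = FIVE_PRIME_UTR_NT + OPEN_READING_FRAME_NT + THREE_PRIME_UTR_NT
codons, remainder = divmod(OPEN_READING_FRAME_NT, 3)
mature_aa = EXTRACELLULAR_AA + TRANSMEMBRANE_AA + CYTOPLASMIC_AA
leader_aa = codons - mature_aa  # assumes no stop codon inside the 3,510 nt

print(total_nt == COMPOSITE_NT)   # True: the three regions account for the cDNA
print(codons, remainder)          # 1170 0
print(mature_aa, leader_aa)       # 1145 25
```

Under the stated assumption, the reading frame encodes 25 residues beyond the three mature domains, which is the size expected for a leader sequence.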

Northern blots demonstrate an LFA-1 α subunit mRNA of 5.5 kb in SKW3 T
lymphoma cells, U937 myelomonocytic cells, and IB4 B lymphoblastoid cells
(Fig. 4 A, lanes 1-3); however, no signal was detected in EJ bladder
carcinoma cells (Fig. 4 A, lane 4). The mRNA expression is in agreement with
the restriction of cell surface expression of LFA-1 to hematopoietic cells.
The β subunit showed the same pattern of expression (Fig. 4 B). In Southern
blots, the 1.2-kb 5' Eco RI-Bam HI fragment from clone λ2L2 hybridized to
two fragments of 10 and 8 kb (8). A genomic clone isolated from a cosmid
library possesses the same two fragments, which are contiguous and hybridize
with different regions of the cDNA (unpublished data), showing that the
LFA-1 α subunit is a single copy gene.

Figure 2. Restriction map of the LFA-1 α subunit cDNA clones. The open
reading frame and untranslated regions are indicated with a thick and thin
line, respectively. Arrows indicating the sequencing strategy are shown. The
relevant restriction sites are Bal I (Bl), Bam HI (Ba), Bgl II (B), Cla I
(C), Eco RI (R), Nru I (N), Pst I (P), Sca I (Sc), Sma I (S), and Sph I
(Sp). [Map labels: clones λ3R1 and λ5L5; scale bar, 1.0 kb.]

[Figure 3: sequence listing not reproduced.]

Figure 4. (A) Northern blot analysis of 10 µg of poly(A+)-selected RNA from
SKW3 (lane 1), U937 (lane 2), IB4 (lane 3), and EJ cells (lane 4). (B)
Identical blot reprobed with the β subunit cDNA probe (23). The positions of
the 28S and 18S rRNA are indicated.

Protein Sequence 

All 105 amino acids determined by microsequencing of tryptic peptides were
found in the translated open reading frame, confirming the authenticity of
the cDNA clones (Fig. 3). Hydrophobicity analysis shows that the LFA-1 α
subunit is a typical transmembrane protein with a 25-residue hydrophobic
signal sequence, an extracellular domain of 1,063 residues, a single
hydrophobic transmembrane region of 29 residues, and a short cytoplasmic
tail of 53 residues. The NH2-terminal residue of human LFA-1 was identified
by homology to the NH2-terminal sequence of murine LFA-1 (48) (Fig. 6, see
below). Human LFA-1 has 55% identity with murine LFA-1 over the first 20
amino acids. A classical signal peptide with a consensus sequence
(Ala-X-Ser/Pro) for the cleavage peptidase precedes the NH2-terminal
sequence. There are three putative upstream translation initiation sites
(ATG) in frame. The use of the first initiation site (nucleotide position
89) is generally favored (24) and gives a 25-residue signal sequence with
several NH2-terminal polar groups, as is typically found in signal sequences
(54).

2. These sequence data have been submitted to the EMBL/GenBank Data
Libraries under the accession number Y00796.

The mature protein is M_r = 126,193. 12 N-linked glycosylation sites
(Asn-X-Thr/Ser) are present in the extracellular domain. These findings are
consistent with the size previously determined by SDS-PAGE for the in vitro
translated murine LFA-1 α subunit (M_r = 140,000), the murine and human
LFA-1 α subunit glycoproteins (M_r = 180,000 and 177,000) (44), and with
previous studies on the glycosylation of LFA-1 (10, 32).
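
Potential N-linked sites of the Asn-X-Thr/Ser form can be located with a short pattern scan. The sketch below is illustrative only: the function name and the test peptide are hypothetical, and the optional exclusion of proline at the X position is a commonly applied refinement that the text does not state.

```python
import re


def find_n_glycosylation_sequons(protein: str, exclude_proline: bool = False) -> list[int]:
    """Return 1-based positions of Asn residues in Asn-X-Thr/Ser motifs.

    A lookahead is used so that overlapping motifs (e.g. NNTS) are all found.
    """
    middle = "[^P]" if exclude_proline else "."
    pattern = re.compile(f"(?=N{middle}[ST])")
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]


# Hypothetical peptide, not taken from the LFA-1 sequence.
toy = "MNLTANGSQNPT"
print(find_n_glycosylation_sequons(toy))                        # [2, 6, 10]
print(find_n_glycosylation_sequons(toy, exclude_proline=True))  # [2, 6]
```

A scan of this kind reports potential sites only; whether a given site carries carbohydrate has to be established experimentally.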

Within the extracellular domain there are seven internal repeats (Fig. 5).
The degree of identity is highest among the three repeats (14.5-33%) located
toward the COOH terminus, which show a statistically significant
relationship (P < 10^-2 to < 10^-6). The relationship among the four repeats
located toward the NH2 terminus is weaker and is discernible by conservation
of flanking sequences (Fig. 5); the homology between repeats IV and V is
significant (P < 10^-4). The central regions of the three COOH-terminal
repeats are similar to the EF hand divalent cation binding site motif,
perhaps due to convergent evolution (Fig. 5). Previous studies have shown
that Mg2+ alone, or at lower concentration in conjunction with Ca2+, is
necessary for ligand binding function (29) (Dustin, M. L., and T. Springer,
unpublished results), suggesting that these sites bind divalent cations.

Figure 3. Nucleotide sequence and derived protein sequence of the α subunit
of LFA-1.² The potential glycosylation sites are boxed. The sequences of the
tryptic peptides and the transmembrane region are underlined with a thick
and shaded line, respectively. The putative serine phosphorylation sites are
circled. The nucleotides in the 3' untranslated region that are underlined
correspond to an Alu I sequence and the polyadenylation site is boxed.

Figure 5. Alignment of the internal repeats of the LFA-1 α subunit. The
consensus flanking sequence is shown above the aligned repeats. [...]
vitronectin receptor [50], fibronectin receptor [3], and platelet
glycoprotein IIb [37]). The calcium and magnesium binding sites of
parvalbumin, troponin C (52), and galactose binding protein (55) are aligned
as shown. The asterisk indicates that more than one oxygen is involved in
cation binding.

Distinct Subfamilies of Integrin a Subunits 

We compared the LFA-1 α subunit to other integrin α subunits (Fig. 6). The
LFA-1 α subunit has striking and consistently higher homology to other
leukocyte integrin α subunits (Mac-1, 35.7%; p150,95, 37.4%) than to ECM
receptor integrin α subunits (vitronectin receptor, 26.4%; FNR, 27.8%;
gpIIb, 30.2%). The ECM receptor integrin α subunits are more related to one
another (mean = 41.9%, SD = 3.7%) than to the leukocyte integrins (Fig. 8,
see below). Further structural features distinguish the leukocyte and ECM
receptor integrins. The leukocyte integrins contain an insertion of ~200
amino acids near the NH2-terminal region of the protein that is not present
in the three sequenced ECM receptor integrins (Fig. 6). Furthermore, a
region containing dibasic amino acid protease cleavage sites in vitronectin
receptor, FNR, and gpIIb/IIIa (residues 853-871) is absent from the LFA-1 as
well as the p150,95 and Mac-1 α subunits, correlating with the lack of
proteolytic processing of the leukocyte integrin α subunits (42). As a
whole, these structural features define two subfamilies of the integrin α
subunits, the leukocyte integrins and the ECM receptor integrins.
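
The comparison above rests on percent identity between aligned sequences. The sketch below shows one plain way to compute such a figure and then averages the values quoted in the text; the identity definition (matches over columns with no gap in either sequence) is an assumption and may differ from what ALIGN reports, and the aligned fragments are hypothetical.

```python
from statistics import mean


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Identities of the LFA-1 alpha subunit to other alpha subunits, as quoted in the text.
leukocyte = {"Mac-1": 35.7, "p150,95": 37.4}
ecm = {"vitronectin receptor": 26.4, "FNR": 27.8, "gpIIb": 30.2}

print(f"leukocyte mean: {mean(leukocyte.values()):.1f}%")
print(f"ECM mean:       {mean(ecm.values()):.1f}%")

# Hypothetical aligned fragments, for illustration only.
print(percent_identity("DVDG-DGLAD", "DINGNDGLVD"))
```

The two means come out close to the 36% and 28% figures given in the abstract for the leukocyte and the other integrin α subunits.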

All the integrin α subunits thus far sequenced have tandem repeats similar
to those of LFA-1 (3, 9, 37, 41, 51) (Fig. 5). Putative divalent cation
binding sites are present in repeats V-VII in all the integrin α subunits
and in repeat IV only in the ECM receptor integrins. These sites are similar
to the EF hand divalent cation binding sites located in the turn between two
α helices in troponin C, parvalbumin (52), and galactose binding protein
(55), among others. In the EF hand loop the metal is chelated to six
oxygen-containing side groups spaced every second or third residue. The
putative divalent cation binding sites on LFA-1 and the other integrins have
a similar primary structure, with the putative chelating residues and
intervening glycine residues being conserved (Fig. 5). However, the residue
in the -Z position is replaced by a hydrophobic residue.



An Inserted Domain in the Leukocyte Integrins 

Searches of the NBRF and SWISS-PROT protein sequence data banks revealed
that the LFA-1 domain of ~200 amino acids, which is not present in the
sequenced ECM integrins, is homologous to domains of the same size in vWF
and cartilage matrix protein (Figs. 7 and 8). These alignments have
20.4-32.1% identity and are statistically significant (P < 10^-9 to
< 10^-23). Of the integrins sequenced to date, this 200-amino acid domain is
unique to the leukocyte integrins. Homologous domains are inserted in
several proteins and in well-studied examples have been documented to
mediate interaction with ligand (see below). For this reason we will refer
to the 200-residue region of the LFA-1, Mac-1, and p150,95 α subunits as the
"I" (inserted/interactive) domain. The homology unit is present in three
tandem repeats in vWF (A domains) (45) (P < 10^-15 to < 10^-23 for
comparison of the three repeats to one another) and two repeats separated by
an EGF domain in CMP (2) (P < 10^-23). With the exception of the
NH2-terminal region of CMP repeat 1, which has not been sequenced, the
repeating units in both vWF and CMP correspond precisely to the region
homologous to the I domain, supporting the concept that this homology unit
is a domain. Similar structural homologies have been noted for the murine
and human Mac-1 α subunit (7, 38), and the degree of homology of Mac-1 and
p150,95 with vWF and CMP is similar to that found with LFA-1 (17.8-31.6%
identity, P < 10^-[?] to < 10^-23). Factor B has previously been found to be
homologous with the vWF A repeats (P < 10^-4 to < 10^-7) (45); factor B in
turn is homologous, but at a lower level, with the LFA-1, Mac-1, and p150,95
I domains (P < 10^-2 to < 10^-4) (7, 38). In factor B a single homology unit
is bounded on one side by the site for the cleavage which activates the B
zymogen to the active Bb fragment, and on the other side by the serine
protease domain (6).

Figure 6. Alignment and comparison of the human LFA-1 α subunit with the other members of the integrin superfamily (Mac-1 [4, 7],
p150,95 [9], vitronectin receptor [50], fibronectin receptor [3], and platelet glycoprotein IIb [37]) and the NH2 terminus of the murine
LFA-1 α subunit [48]. The residues common to LFA-1 and at least one other integrin are boxed. The area of the I domain is shown in
Fig. 7. The protease cleavage sites in the ECM receptor α subunits are indicated with black dots.

Figure 7. Alignment and comparison of the LFA-1 α subunit I domain with the homologous domains in Mac-1 (4, 7), p150,95 (9), vWF
(45), factor B (34), and CMP (3).

Based on these homologies, we propose the following
pathway for leukocyte integrin evolution. A primordial gene
duplicated and gave rise to at least two branches of integrin
α subunits, the leukocyte integrins and the ECM integrin
receptors. The percent identity among the I domains is very
similar to the overall percent identity among the leukocyte
integrins, implying that a single I domain was incorporated
into a primordial leukocyte integrin only once rather than
independently. Then, the gene duplicated and gave rise to
LFA-1 and a Mac-1/p150,95 primordial gene. Further
duplication of the Mac-1/p150,95 primordial gene gave rise
to Mac-1 and p150,95. This scheme is consistent with the
observation that Mac-1 and p150,95 are more closely related
to each other (61.0%) than to LFA-1 (35.7 and 37.4%,
respectively). Since the LFA-1, Mac-1, and p150,95 α subunit
genes are located on chromosome 16, band p11 (8), these
genes have remained in close proximity. ECM α subunits
appear to be a separate branch since they are more closely
related to each other than to the leukocyte integrins, even
ignoring the I domain (Fig. 8). Although three sequenced α
subunits of ECM integrins bind to two different β subunits,
the two α subunits associated with the β3 subunit are not
more closely related to one another than to the α subunit
associated with the β1 subunit. Thus, although there are
three distinct β subunits, only two subfamilies of α
subunits can be distinguished by sequence homology
considerations. The structural properties of the leukocyte
integrin α subunits may not be unique to their subfamily
since some VLA integrin α subunits may have a homologous I
domain and lack a proteolytic cleavage site (Hemler, M.,
personal communication).
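
The branching order proposed for the leukocyte integrins can
be illustrated with the three pairwise identities quoted
above. The following is only a sketch, assuming a simple
greedy joining in which the most similar pair is merged first
and its identities to the remaining sequences are averaged;
it is not the procedure used to derive the scheme in the
text, and the function name `join_order` is hypothetical.

```python
# Pairwise percent identities quoted in the text for the
# leukocyte integrin alpha subunits.
identity = {
    frozenset({"Mac-1", "p150,95"}): 61.0,
    frozenset({"LFA-1", "Mac-1"}): 35.7,
    frozenset({"LFA-1", "p150,95"}): 37.4,
}


def join_order(table):
    """Greedily merge the most similar pair; return clusters in merge order."""
    table = dict(table)
    merges = []
    while table:
        pair = max(table, key=table.get)      # most similar remaining pair
        a, b = sorted(pair)
        merged = f"({a},{b})"
        merges.append(merged)
        others = set().union(*table) - pair   # names not in the merged pair
        # Keep entries that do not involve either member of the merged pair.
        updated = {p: v for p, v in table.items() if not (p & pair)}
        for other in others:
            # Identity of the new cluster to each remaining name is the
            # plain average of its two members' identities.
            avg = (table[frozenset({a, other})] + table[frozenset({b, other})]) / 2
            updated[frozenset({merged, other})] = avg
        table = updated
    return merges


for step, cluster in enumerate(join_order(identity), start=1):
    print(step, cluster)
```

With these values Mac-1 and p150,95 are joined first and
LFA-1 joins that pair second, which matches the order of
duplications described in the text.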

The I domain appears to be incorporated in proteins which
are otherwise structurally distinct. These structurally
similar domains form an important protein superfamily which
we will designate the I domain superfamily. Repeats within
vWF and within CMP are more closely related to each other
than to the domains in other proteins (Fig. 8). Thus, it
appears that a single domain was incorporated and then
duplicated or triplicated. Members of the I domain
superfamily serve important recognition functions in several
proteins. The A1 domain of vWF binds to glycoprotein Ib and
heparin (15), while both the A1 and A3 domains are involved
in binding to collagen. The domain in factor B is located in
a region of the molecule available for interaction with its
ligand C3b (6, 34). The domain in CMP may also have an
important role in interaction with collagen (2) and
cartilage proteoglycan (35). These domains lack
glycosylation and cysteines for the most part. Similarly,
the I domain and the repeats with divalent cation binding
sites in LFA-1 contain only one N-glycosylation site and
cysteine (Fig. 8). As in the Mac-1 (7, 38) and p150,95 (9)
α subunits, the N-glycosylation sites and cysteines are
located primarily NH2-terminal and COOH-terminal to this
region. Therefore, the I domain would be accessible to
ligand and capable of conformational changes which may be
important in regulation of ligand binding (39).

Figure 8. Schematic representation of the evolutionary relationships of the domains homologous to the I domain. Black lollipops
(above schematic) and circles (below schematic) indicate the respective sites of potential glycosylation and cysteines within LFA-1.
The percent identity and the standard deviations among the different integrin α subunits and members of the I domain superfamily
are shown. Sequences were compared pairwise using the ALIGN program. The I domain sequences were as follows: LFA-1, Mac-1,
and p150,95 α subunits (130-327; 133-337; 131-335, respectively), the A repeats of von Willebrand factor (A1: 513-717; A2: 734-922;
A3: 927-1115), factor B (245-450), and cartilage matrix protein (repeat 1: 1-167; repeat 2: 195-393). Relevant probability values are
shown in the text.

Structural differences between the ECM receptor integrin
α subunits and LFA-1 correlate with differences in
recognition specificity; RGD-containing peptides block the
binding of the VNR, FNR, and gpIIb/IIIa to their ligands
(41) but not binding of LFA-1 to ICAM-1 (29). Furthermore,
ICAM-1 does not contain an RGD sequence and, unlike other
known integrin ligands, is a member of the immunoglobulin
gene superfamily. It will be of interest to determine
whether the I domain confers specificity for
non-RGD-containing ligands.

This study demonstrates that the LFA-1 α subunit belongs
to the integrin superfamily but possesses an additional
domain. This I domain and homologous domains constitute a
protein "domain" family that is of functional importance.
Since LFA-1 is involved in a large number of leukocyte
functions and may have more than one ligand (39) (Dustin,
M. L., and T. Springer, manuscript in preparation), it is
possible that more than one functional domain exists in the
LFA-1 α subunit. The availability of cDNA clones for the α
and β subunits will allow these and other structure-function
relationships to be examined.